Long-range dependence induced by heavy tails is a widely reported feature of internet traffic. 
Long-range dependence can be defined as the regular variation of the variance of the integrated 
process, and half the index of regular variation is then referred to as the Hurst index. The 
infinite-source Poisson process (a particular case of which is the M/G/∞ queue) is a simple 
and popular model with this property, when the tail of the service time distribution is regularly 
varying. The Hurst index of the infinite-source Poisson process is then related to the index of 
regular variation of the service times. In this paper, we present a wavelet-based estimator of 
the Hurst index of this process, when it is observed either continuously or discretely over an 
increasing time interval. Our estimator is shown to be consistent and robust to some form of 
non-stationarity. Its rate of convergence is investigated. 

Keywords: heavy tails; internet traffic; long-range dependence; Poisson point processes; 
semiparametric estimation; wavelets 

1. Introduction 

We consider the infinite-source Poisson process with random transmission rate defined 
by 

$$X(t) = \sum_{\ell \ge 0} U_\ell \, 1_{\{t_\ell \le t < t_\ell + \eta_\ell\}}, \quad t \ge 0, \tag{1.1}$$

where the arrival times $\{t_\ell\}_{\ell \ge 0}$ are the points of a unit-rate homogeneous Poisson 
process on the positive half-line, independent of the initial conditions; and the durations 
and transmission rates $\{(\eta_\ell, U_\ell)\}$ are independent and identically distributed random 
variables with values in $(0,\infty) \times \mathbb{R}$ and independent of the Poisson process and of the 
initial conditions. This process was considered by Resnick and Rootzen [12] and Mikosch 
et al. [9], among others. The M/G/∞ queue is a special case, for $U_\ell = 1$. An important 
motivation for the infinite-source Poisson process is to model the instantaneous rate of 
the workload going through an internet link. Although overly simple models are generally 
not relevant for internet traffic at the packet level, it is generally admitted that rather 
simple models can be used for higher-level (the so-called flow level) traffic such as TCP 
or HTTP sessions, one of them being the infinite-source Poisson process (see Barakat et 
al. [1]). One way to analyse internet traffic empirically at the flow level using the 
infinite-source Poisson process would be to retrieve all the variables $\{t_\ell, \eta_\ell, U_\ell\}$ involved in 
the observed traffic during a given period of time, but this would require the collection 
of all the relevant information in the packet headers (such as source and destination 
addresses) for the purpose of separating the aggregated workload into transmission rates 
at a pertinent level; see Duffield et al. [3] for many insights into this problem. 
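
To make the model (1.1) concrete, the following minimal sketch simulates a path sampled at integer times. It is an illustration only: the Pareto law for the durations, the default unit transmission rate (the M/G/∞ case) and the function name `simulate_infinite_source_poisson` are assumptions made here, not choices taken from the paper.

```python
import math
import random


def simulate_infinite_source_poisson(T, alpha, rate_sampler=None, seed=0):
    """Return [X(0), X(1), ..., X(T-1)] for the process (1.1).

    Assumptions of this sketch (not from the paper):
    - durations are Pareto: P(eta > t) = t**(-alpha) for t >= 1;
    - transmission rates are 1 unless `rate_sampler(rng)` is supplied.
    """
    rng = random.Random(seed)
    diff = [0.0] * (T + 1)  # difference array over integer sample times
    t = 0.0
    while True:
        t += rng.expovariate(1.0)  # unit-rate Poisson arrivals
        if t >= T:
            break
        eta = (1.0 - rng.random()) ** (-1.0 / alpha)  # Pareto duration, >= 1
        u = 1.0 if rate_sampler is None else rate_sampler(rng)
        first = math.ceil(t)            # first integer time with t_l <= k
        last = math.ceil(t + eta) - 1   # last integer time with k < t_l + eta_l
        if first >= T:
            continue
        diff[first] += u
        diff[min(last, T - 1) + 1] -= u
    path, level = [], 0.0
    for k in range(T):
        level += diff[k]
        path.append(level)
    return path


path = simulate_infinite_source_poisson(T=200, alpha=1.5, seed=1)
```

With unit rates the path counts the sessions active at each integer time; smaller values of `alpha` produce longer sessions and therefore more persistent paths.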

It is well known that heavy tails in the durations $\{\eta_k\}$ result in long-range dependence 
of the process $X(t)$. Long-range dependence can be defined by the regular variation of the 
autocovariance of the process or more generally by the regular variation of the variance 
of the integrated process: 

$$\mathrm{var}\left(\int_0^t X(s)\,ds\right) = L(t)\, t^{2H},$$

where $L$ is a slowly varying function at infinity and $H > 1/2$ is often referred to as the 
Hurst index of the process. For the infinite-source Poisson process, the Hurst index $H$ 
is related to the tail index $\alpha$ of the durations by the relation $H = (3 - \alpha)/2$. The 
long-range dependence property has motivated many empirical studies of internet traffic and 
theoretical ones concerning its impact on queuing (these questions are studied in the 
M/G/∞ case in Parulekar and Makowski [10]). 

However, to the best of our knowledge, no statistical procedure to estimate H has 
been rigorously justified. It is the aim of this paper to propose an estimator of the 
Hurst index of the infinite-source Poisson process, and to derive its statistical properties. 
We propose to estimate $H$ (or equivalently $\alpha$) from a path of the process $X(t)$ over 
a finite interval $[0,T]$, observed either continuously or discretely. In practice this can 
be done by counting all the packets going through some point of the network and then 
collecting local traffic rate measurements. Our estimator is based on the so-called wavelet 
coefficients of a path. There is a wide literature on this methodology for estimating 
long-range dependence, starting as long ago as Wornell and Oppenheim [13], but we are not 
aware of rigorous results for non-Gaussian or non-stable processes. The main contribution 
of this paper is thus the proof of the consistency of our estimator. We also investigate 
the rate of convergence of the estimator in the case $\alpha > 1$. If the process is observed 
continuously, the rate of convergence is determined by a smoothness parameter of the model 
(Theorem 4.2). In the case of discrete observations, the rate can be much slower. Also, the 
choice of the tuning parameters of the estimator is much more restricted in the latter case, 
and practitioners should be aware of this; see Section 4.3 for details. 

The process X is formally defined in Section 2. We state our assumptions and, using a 
point-process representation of X, we establish some of its main properties. The wavelet 
coefficients are defined and the scaling property of their variances is obtained in Section 3. 
The estimator is defined and its properties are established in Section 4. The Appendix 
contains technical lemmas. 



2. Basic properties of the model 
2.1. Assumptions 

We now introduce the complete assumption on the joint distribution of the transmission 
rates and durations. 

Assumption 1. (i) The random vectors $\{(\eta,U), (\eta_\ell,U_\ell), \ell \in \mathbb{Z}\}$ are independent with 
common distribution $\nu$ on $(0,\infty) \times \mathbb{R}$ and independent of the homogeneous Poisson point 
process on the real line with points $\{t_\ell\}_{\ell\in\mathbb{Z}}$ such that $t_\ell < t_{\ell+1}$ for all $\ell$ and $t_{-1} < 0 < t_0$. 

(ii) There exists a positive integer $p^*$ such that $E[|U|^{p^*}] < \infty$. 

(iii) There exist a real number $\alpha \in (0,2)$ and positive functions $L_0, \dots, L_{p^*}$ slowly 
varying at infinity such that, for all $t > 0$ and $p = 0, \dots, p^*$, 

$$H_p(t) := E[|U|^p 1_{\{\eta > t\}}] = L_p(t)\, t^{-\alpha}. \tag{2.1}$$

Since $\eta > 0$, the functions $H_p$ are continuous at zero and $H_p(0) = E[|U|^p]$. 
Condition (2.1) is equivalent to saying that the functions $H_p$, $p = 0, 1, \dots, p^*$, are regularly 
varying with index $-\alpha$. If $\alpha > 1$ and $p^* \ge 2$, Assumption 1 and Karamata's theorem 
imply the following asymptotic equivalence: 

$$E[U^2 (\eta - t)_+] = \int_{v=t}^{\infty} E[U^2 1_{\{v < \eta\}}]\,dv = \int_{v=t}^{\infty} H_2(v)\,dv \sim \frac{1}{\alpha - 1} L_2(t)\, t^{1-\alpha}. \tag{2.2}$$



Remark 2.1. Assumption 1 will be used with p* = 2 to prove the regular variation of 
the autocovariance function of the process X and with p* = 4 to prove consistency of 
our estimators. It can be related to the theory of multivariate regular variation (see, for 
instance, Maulik et al. [7]). But the definitions of multivariate regular variation involve 
vague convergence and do not necessarily ensure the convergence of moments required 
here. 

Remark 2.2. We do not assume that $U$ is non-negative. This allows us to consider 
applications other than teletraffic modelling. For instance, the process $X$ could be used 
to model the volatility of some financial time series. 



Remark 2.3. We will often have to separate the cases $E[\eta] = \infty$ and $E[\eta] < \infty$. These 
cases are respectively implied by $\alpha < 1$ and $\alpha > 1$. If $\alpha = 1$, the finiteness of $E[\eta]$ depends 
on the precise behaviour of $L_0$ at infinity. 

Example 2.1. Assumption 1 implies in particular that the tail of the distribution of $\eta$ 
is regularly varying with index $\alpha$. This in turn implies Assumption 1 if $U$ and $\eta$ are 
independent and $E[|U|^{p^*}] < \infty$, in which case the functions $L_p$ differ by a multiplicative 
constant. 
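
The last claim of Example 2.1 can be written out in one line. The following worked step is an illustration, assuming only that $U$ and $\eta$ are independent and that $P(\eta > t) = L_0(t)\,t^{-\alpha}$:

$$H_p(t) = E[|U|^p 1_{\{\eta > t\}}] = E[|U|^p]\, P(\eta > t) = E[|U|^p]\, L_0(t)\, t^{-\alpha}, \qquad \text{so that} \qquad L_p(t) = E[|U|^p]\, L_0(t).$$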



Example 2.2. Assumption 1 also holds in the following case which is of interest in 
teletraffic modelling. In a TCP/IP traffic context, $\eta$ and $U$ represent respectively the 
duration of a download session and its intensity (bit rate). Then $W := U\eta$ represents the 
amount of transmitted data. We assume that, for some $u_0 > 0$, there exist two regimes, 
$U \ge u_0$ (xDSL/LAN/cable connection) and $U \in (0, u_0)$ (RTC connection), such that the 
following statements hold: 

• The distribution of $W$ given $U = u \ge u_0$ is heavy-tailed and independent of $u$: 
$P(W > w \mid U = u) = L(w)\, w^{-\alpha}$. 

• The distribution of $W$ given $U = u \in (0,u_0)$ is light-tailed uniformly with respect 
to $u$. For instance, we assume exponentially decaying tails, $P(W > w \mid U = u) \le \exp(-\beta w^{\gamma})$, 
for some $\beta > 0$ and $\gamma > 0$. 

An explicit example for two such regimes is obtained when the conditional density of $W$ 
given $U = u$ is equal to $\alpha w^{-\alpha-1} 1_{\{w > 1\}}$ if $u \ge u_0$ and $\exp(-w)$ if $u < u_0$. 
Concerning the distribution of $U$ we only assume that: 

• $P(U \ge u_0) > 0$, $E[|U|^{-\alpha-\epsilon}] < \infty$ for some $\epsilon > 0$, and $E[|U|^{p^*}] < \infty$. 

Then (2.1) holds for $p \le p^*$. Indeed, 

$$E[U^p 1_{\{\eta > t\}} 1_{\{U \ge u_0\}}] = E[U^p 1_{\{W > Ut\}} 1_{\{U \ge u_0\}}] = E[U^p L(Ut)(Ut)^{-\alpha} 1_{\{U \ge u_0\}}] = L(t)\, t^{-\alpha}\, E[U^{p-\alpha} L(Ut)/L(t)\, 1_{\{U \ge u_0\}}]. \tag{2.3}$$

Since $L$ is slowly varying at infinity, $\lim_{t\to\infty} L(ut)/L(t) = 1$, uniformly with respect to $u$ 
in compact sets of $(0, +\infty)$, and there exists $t_0 > 0$ such that, for $u \ge u_0$, $t \ge t_0$, 

$$\frac{L(ut)}{L(t)} \le (1 + \epsilon)\, u^{\epsilon};$$

see, for example, Resnick ([11], Proposition 0.8). Then, by the dominated convergence 
theorem, 

$$\lim_{t\to\infty} E[U^{p-\alpha} L(Ut)/L(t)\, 1_{\{U \ge u_0\}}] = E[U^{p-\alpha} 1_{\{U \ge u_0\}}]. \tag{2.4}$$

Consider now the low-bit-rate regime. Since, for all $x > 0$, $\exp\{-\beta x^{\gamma}\} \le C x^{-\alpha-\epsilon}$ for 
some positive constant $C$, we have 

$$E[U^p 1_{\{\eta > t\}} 1_{\{U < u_0\}}] \le E[U^p \exp\{-\beta (Ut)^{\gamma}\} 1_{\{U < u_0\}}] \le C t^{-\alpha-\epsilon} E[U^{p-\alpha-\epsilon} 1_{\{U < u_0\}}].$$

Using the assumption on $U$, since $p \ge 0$, the rightmost expectation in the previous display 
is finite and we obtain that 

$$\lim_{t\to\infty} t^{\alpha} L^{-1}(t)\, E[U^p 1_{\{\eta > t\}} 1_{\{U < u_0\}}] = 0.$$

Together with (2.3) and (2.4), this implies that, as $t \to \infty$, 
$E[U^p 1_{\{\eta > t\}}]\, t^{\alpha} \sim L(t)\, E[U^{p-\alpha} 1_{\{U \ge u_0\}}]$, hence is slowly varying and Assumption 1 holds. 

2.2. Point-process representation and stationary version 

Let $N$ denote a Poisson point process on a set $E$ endowed with a $\sigma$-field $\mathcal{E}$ with 
intensity measure $\mu$, that is, a random measure such that for any disjoint $A_1, \dots, A_p$ in $\mathcal{E}$, 
$N(A_1), \dots, N(A_p)$ are independent random variables with Poisson law with respective 
parameters $\mu(A_i)$, $i = 1, \dots, p$. The main property of Poisson point processes that we will 
use is the following cumulant formula (see, for instance, Resnick [11]: Chapter 3). For 
any positive integer $p$ and functions $f_1, \dots, f_p$ such that $\int |f_i|\,d\mu < \infty$ and $\int |f_i|^p\,d\mu < \infty$ 
for all $i = 1, \dots, p$, the $p$th-order joint cumulant of $N(f_1), \dots, N(f_p)$ exists and is given 
by 

$$\mathrm{cum}(N(f_1), \dots, N(f_p)) = \int f_1 \cdots f_p \, d\mu. \tag{2.5}$$

Let $N_S$ be the point process on $\mathbb{R} \times (0,\infty) \times \mathbb{R}$ with points $(t_\ell, \eta_\ell, U_\ell)_{\ell\in\mathbb{Z}}$, that is, 
$N_S = \sum_{\ell\in\mathbb{Z}} \delta_{t_\ell, \eta_\ell, U_\ell}$. Under Assumption 1(i), it is a Poisson point process with intensity 
measure $\mathrm{Leb} \otimes \nu$, where Leb is the Lebesgue measure on $\mathbb{R}$. For $t, u \in \mathbb{R}$, define 

$$A_t = \{(s,v) \in \mathbb{R} \times \mathbb{R}_+ \mid s \le t < s + v\},$$
$$B_u = \{\lambda u \mid \lambda \in [1,\infty)\}.$$

We can now show that if $E[\eta] < \infty$, then one can define a stationary version for $X$ and 
provide its second-order properties. 

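Proposition 2.1. If Assumption 1(i) holds and $E[\eta] < \infty$, then the process 

$$X_S(t) = \sum_{\ell\in\mathbb{Z}} U_\ell\, 1_{\{t_\ell \le t < t_\ell + \eta_\ell\}} \tag{2.6}$$

is well defined and strictly stationary. It has the point-process representation 

$$X_S(t) = \int_0^{\infty} N_S(A_t \times B_u)\,du - \int_{-\infty}^{0} N_S(A_t \times B_u)\,du. \tag{2.7}$$

Let $K_0 = \sup\{\ell > 0 \mid t_{-\ell} + \eta_{-\ell} > 0\}$, $\tilde U_\ell = U_{-\ell}$ and $\tilde\eta_\ell = \eta_{-\ell} + t_{-\ell}$. Then, for all $t \ge 0$, 

$$X_S(t) = \sum_{\ell=1}^{K_0} \tilde U_\ell\, 1_{\{t < \tilde\eta_\ell\}} + X(t). \tag{2.8}$$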






G. Fay, F. Roueff and P. Soulier 



If, moreover, $p^* \ge 2$, then $X_S$ has finite variance and 

$$E[X_S(t)] = E[U\eta],$$
$$\mathrm{cov}(X_S(0), X_S(t)) = E[U^2(\eta - t)_+] = \int_t^{\infty} H_2(v)\,dv.$$

Remark 2.4. Note that if $\alpha > 1$, then $E[\eta] < \infty$ and, by Karamata's theorem, 

$$\mathrm{cov}(X_S(0), X_S(t)) \sim \frac{1}{\alpha - 1} L_2(t)\, t^{1-\alpha}, \quad t \to +\infty.$$

Proof. The number of non-vanishing terms in the sum (2.6) is $N_S(A_t \times \mathbb{R})$ and has 
a Poisson distribution with mean $E\int_{\mathbb{R}} 1_{A_t}(s,\eta)\,ds = E[\eta]$. Thus $X_S$ is well defined and 
stationary since $N_S$ is stationary. The number of indices $\ell > 0$ such that $t_{-\ell} + \eta_{-\ell} > 0$ is 
$N_S(A_0 \times \mathbb{R})$, hence if $K_0$ is the largest of those $\ell$s, it is almost surely finite and, for $t \ge 0$, 

$$\sum_{t_\ell < 0} U_\ell\, 1_{\{t_\ell \le t < t_\ell + \eta_\ell\}} = \sum_{\ell=1}^{K_0} \tilde U_\ell\, 1_{\{t < \tilde\eta_\ell\}}.$$

Hence (2.8). 

The point-process representation (2.7) and formulae (2.5) and (2.2) finally yield the 
given expressions for the mean and covariance. □ 

Relation (2.8) shows that the stationary version Xs can be defined by changing the 
initial condition of the system. More generally, one could consider any initial conditions, 
that is, any process defined as on the right-hand side of (2.8) with $K_0$ and $\tilde\eta_\ell$, $\ell > 0$, 
finite. Since the initial conditions almost surely vanish after a finite period, they have a 
negligible impact on the estimation procedure. Thus, our result on X easily generalizes 
to any such initial conditions, and, in particular, to the stationary version Xs, when it 
exists. 

Applying arguments similar to those used to prove Proposition 2.1, we obtain: 

Proposition 2.2. The process $X$ admits a point-process representation 

$$X(t) = \int_0^{\infty} N_S(A_t^+ \times B_u)\,du - \int_{-\infty}^{0} N_S(A_t^+ \times B_u)\,du, \tag{2.9}$$

where $A_t^+ = A_t \cap \mathbb{R}_+^2$. 

If Assumption 1 holds with $p^* \ge 2$, then the process $X$ is non-stationary with 
expectation and autocovariance function given, for $s \le t$, by 

$$E[X(t)] = E[U(\eta \wedge t)],$$
$$\mathrm{cov}(X(s), X(t)) = E[U^2 \{s - (t - \eta)_+\}_+] = \int_{t-s}^{t} H_2(v)\,dv.$$

By the uniform convergence theorem for slowly varying functions, the following 
asymptotic equivalence of the covariance holds. For any $\alpha \in (0,2)$ and all $t > s > 0$, as $T \to \infty$, 

$$\mathrm{cov}(X(Tt), X(Ts)) \sim C\, L_2(T)\, T^{1-\alpha} \tag{2.10}$$

with $C = \int_{t-s}^{t} w^{-\alpha}\,dw$. 

In accordance with the notation in use in the context of long-memory processes, we 
can define the Hurst index of the process $X$ as $H = (3 - \alpha)/2$, because the variance of 
the process integrated between 0 and $T$ increases as $T^{2H}$. If $\alpha < 1$, then $H > 1$. This case 
has been considered, for instance, by Resnick and Rootzen [12]. 
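
For $\alpha \in (1,2)$ the link between the covariance decay and the exponent $2H$ can be checked directly on the stationary version. The following is a sketch of that computation, using the covariance equivalence of Remark 2.4 and Karamata's theorem; it is an illustration rather than part of the original argument:

$$\mathrm{var}\left(\int_0^T X_S(s)\,ds\right) = 2\int_0^T (T - t)\,\mathrm{cov}(X_S(0), X_S(t))\,dt \sim \frac{2 L_2(T)}{\alpha - 1} \int_0^T (T - t)\, t^{1-\alpha}\,dt = \frac{2\, L_2(T)\, T^{3-\alpha}}{(\alpha-1)(2-\alpha)(3-\alpha)},$$

so the variance of the integrated process is regularly varying with index $3 - \alpha = 2H$.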

3. Wavelet coefficients 
3.1. Continuous observation 

Let $\psi$ be a bounded real-valued function with compact support in $[0,M]$ and such that 

$$\int_0^M \psi(s)\,ds = 0. \tag{3.1}$$

For integers $j \ge 0$ and $k \in \mathbb{Z}$, define 

$$\psi_{j,k}(s) = 2^{-j/2}\, \psi(2^{-j}s - k). \tag{3.2}$$

The wavelet coefficients of the path are defined as 

$$d_{j,k} = \int_0^{\infty} \psi_{j,k}(s)\, X(s)\,ds \tag{3.3}$$

(see, for example, Cohen [2]). Assume that a path of the process $X$ is observed 
continuously between times 0 and $T$. Since $\psi_{j,k}$ has support in $[k2^j, (k+M)2^j]$, the coefficients 
$d_{j,k}$ can be computed for all $(j,k)$ such that $T2^{-j} \ge M$ and $k = 0, 1, \dots, T2^{-j} - M$. 
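According to Lemma A.1, one may define, for all $j$ and $k$, 

$$d^S_{j,k} = \sum_{\ell\in\mathbb{Z}} U_\ell \int_{t_\ell}^{t_\ell + \eta_\ell} \psi_{j,k}(s)\,ds. \tag{3.4}$$

As stated in Lemma A.1, if $E[\eta] < \infty$, we have $d^S_{j,k} = \int_{-\infty}^{\infty} \psi_{j,k}(s)\, X_S(s)\,ds$. Nevertheless, 
even if $E[\eta] = \infty$, the sequence of coefficients at a given scale $j$, $\{d^S_{j,k}, k \in \mathbb{Z}\}$, is stationary. 
Moreover, the definition (3.4) yields: 

Lemma 3.1. Let Assumption 1 hold with $p^* \ge 2$. We have 

$$E[d^S_{j,k}] = 0, \qquad \mathrm{var}(d^S_{j,k}) = \mathcal{L}(2^j)\, 2^{(2-\alpha)j}, \tag{3.5}$$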
where 

$$\mathcal{L}(z) := z^{\alpha} \int_{-\infty}^{\infty} \int_0^{\infty} \left( \int_{-\infty}^{\infty} \left| \int_t^{t + vz^{-1}} \psi(s)\,ds \right|^2 dt \right) w^2\, \nu(dv, dw) \tag{3.6}$$

is slowly varying as $z \to \infty$. More precisely, we have the asymptotic equivalence 

$$\mathcal{L}(z) \sim C_{\mathcal{L}}\, L_2(z) \quad \text{as } z \to \infty, \tag{3.7}$$

with $C_{\mathcal{L}} = \alpha \int_0^{\infty} \int_{-\infty}^{\infty} \left\{ \int_t^{t+v} \psi(s)\,ds \right\}^2 dt\; v^{-\alpha-1}\,dv > 0$. 

The proof of (3.5) is a straightforward application of (2.5), and the proof of the 
asymptotic equivalence (3.7) is obtained by standard arguments on slowly varying functions. A 
detailed proof can be found in Fay et al. [4]. 

3.2. Wavelet coefficients in discrete time 

Let $\phi$ be a bounded $\mathbb{R} \to \mathbb{R}$ function with compact support included in $[-M+1, 1]$ and 
such that 

$$\sum_{k\in\mathbb{Z}} \phi(t - k) = 1, \quad t \in \mathbb{R}. \tag{3.8}$$

Let $I_\phi$ denote the operator defined on the set of functions $x : \mathbb{R} \to \mathbb{R}$ by 

$$I_\phi[x](t) = \sum_{k\in\mathbb{Z}} x(k)\, \phi(t - k). \tag{3.9}$$

The wavelet coefficients of $x$ are then defined as the wavelet coefficients of $I_\phi[x]$. 

From a computational point of view, it is convenient to choose $\phi$ and $\psi$ to be the so-called 
father and mother wavelets of a multiresolution analysis; see, for instance, Meyer [8]. The 
simplest choice is to take $\phi$ and $\psi$ to be associated with the Haar system, in which case 
$M = 1$, $\phi = 1_{[0,1)}$ and $\psi = 1_{[0,1/2)} - 1_{[1/2,1)}$. 

If the process $X$ is observed discretely, we denote its wavelet coefficients by 

$$d^D_{j,k} = \int \psi_{j,k}(s)\, I_\phi[X](s)\,ds. \tag{3.10}$$

If we observe $X(0), X(1), \dots, X(T-1)$, for some positive integer $T$, we can compute $d^D_{j,k}$ 
for all $j, k$ such that $0 \le k \le 2^{-j}(T - M + 1) - M$. Roughly, for $2^j > T/M$, no coefficients 
can be computed and if $2^j \le T/M$ the number of computable wavelet coefficients at scale 
$2^j$ is of order $T2^{-j} + 1 - M$ for $j$ and $T$ large. 
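
For the Haar system ($M = 1$), the interpolated path in (3.9) is piecewise constant on unit intervals, so for $j \ge 1$ the coefficient (3.10) reduces to $2^{-j/2}$ times the difference between the sums of the samples over the two halves of the block $[k2^j, (k+1)2^j)$. The sketch below implements this special case only; the function name `haar_coefficients` is introduced here for illustration, and the Haar choice is an assumption (it does not satisfy the stronger conditions required later for the unstable case).

```python
def haar_coefficients(x, j):
    """Discrete-time Haar wavelet coefficients at scale index j >= 1.

    x is the list [X(0), ..., X(T-1)].  With phi = 1_[0,1) and
    psi = 1_[0,1/2) - 1_[1/2,1), the coefficient at location k equals
    2**(-j/2) * (sum over first half-block - sum over second half-block).
    """
    if j < 1:
        raise ValueError("scale index j must be at least 1 for Haar")
    block = 2 ** j
    half = block // 2
    scale = 2.0 ** (-j / 2.0)
    coeffs = []
    for k in range(len(x) // block):  # only complete blocks are usable
        start = k * block
        left = sum(x[start:start + half])
        right = sum(x[start + half:start + block])
        coeffs.append(scale * (left - right))
    return coeffs
```

A constant path gives zero coefficients at every scale, which reflects the vanishing integral (3.1) of the wavelet.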

Remark 3.1. Observe that the choice of time units is unimportant here. Indeed, in 
Assumption 1, changing the time units simply amounts to adapting the slowly varying 
functions $L_k$ and the rate of the arrival process $\{t_k\}$. Clearly these adaptations do not 
modify our results since precise multiplicative constants are not considered. 
3.3. Averaged observations 

We describe now a third observation scheme for which our results can easily be extended. 
Suppose that $T$ is a positive integer and that we observe local averages of the trajectory 

$$\bar X(k) := \int_k^{k+1} X(t)\,dt = \int X(t)\, \phi_H(t - k)\,dt, \quad k = 0, 1, \dots, T - 1,$$

where $\phi_H := 1_{[0,1]}$ is the Haar wavelet. Let $\bar I_\phi$ denote the operator on locally integrable 
functions $x$ defined by 

$$\bar I_\phi[x](t) = \sum_{k\in\mathbb{Z}} \left( \int x(s)\, \phi_H(s - k)\,ds \right) \phi(t - k).$$

For this observation scheme, as in Section 3.2, one may compute the wavelet coefficients 
of the function $\bar I_\phi[X]$ at all scale and location indices $(j,k)$ such that 
$0 \le k \le 2^{-j}(T - M + 1) - M$. If $\phi = \phi_H$ and $\psi$ is the Haar mother wavelet, $\psi = 1_{[0,1/2)} - 1_{[1/2,1)}$, then 
the wavelet coefficients of $\bar I_\phi[X]$ are precisely the continuous wavelet coefficients defined 
in (3.3). For any other choice of $\phi$ and $\psi$, this is no longer true. We will not treat this 
case, but all our results can be extended at the cost of further technicalities. 



4. Estimation 

Tail index estimation methods do not seem appropriate here for estimating the parameter 
$\alpha$. Indeed, $\alpha$ is the tail index of the unobserved durations $\{\eta_k\}$, whereas the observed 
process $X(t)$ always has finite variance ($E[|X(t)|^p] < \infty$ if and only if $E[|U|^p] < \infty$ and 
the marginal distribution of $X(t)$ is Poisson if $U = 1$ almost surely). But as shown by 
Proposition 2.2, $\alpha$ is related to the second-order properties of the process: the coefficient 
$H = (3 - \alpha)/2$ can be viewed as its Hurst index, that is, $H$ governs the rate of decay of 
the autocovariance function of the process. Therefore it seems natural to use an estimator 
of the Hurst index. 



4.1. The estimator 

Lemma 3.1 provides the rationale for a minimum contrast estimator of $\alpha$ which is related 
to the local Whittle estimator; cf. Künsch [6]. Let $\tilde d_{j,k}$ denote the wavelet coefficients 
which are actually available; these may be obtained from continuous-time ($\tilde d_{j,k} = d_{j,k}$) 
or discrete-time ($\tilde d_{j,k} = d^D_{j,k}$) observations. Let $\Delta$ be a set of indices $(j,k)$ of available 
wavelet coefficients. Denote the mean scale index over $\Delta$ by 

$$\delta := \frac{1}{\#\Delta} \sum_{(j,k)\in\Delta} j.$$

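The reduced local Whittle contrast function is 

$$\hat W(\alpha') := \log\left( \sum_{(j,k)\in\Delta} 2^{(\alpha' - 2)j}\, \tilde d_{j,k}^{\,2} \right) + \delta \log(2)\,(2 - \alpha'). \tag{4.1}$$

The local Whittle estimator of $\alpha$ is then defined as 

$$\hat\alpha := \arg\min_{\alpha' \in (0,2)} \hat W(\alpha'). \tag{4.2}$$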
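
The estimator can be sketched numerically as follows. This is a minimal illustration, not the authors' implementation: it assumes that the contrast is the logarithm of the sum of $2^{(\alpha'-2)j}\tilde d_{j,k}^{\,2}$ over the available indices plus $\delta\log(2)(2-\alpha')$, as suggested by the notation used in the proof of Theorem 4.1, and it replaces the exact minimization by a grid search. The names `whittle_contrast` and `local_whittle_estimate`, the dictionary layout and the grid size are all hypothetical.

```python
import math


def whittle_contrast(alpha, coeffs_by_scale):
    """Reduced local Whittle contrast at a candidate value alpha.

    coeffs_by_scale maps a scale index j to the list of available
    wavelet coefficients at that scale (assumed layout).
    """
    n_total = sum(len(c) for c in coeffs_by_scale.values())
    # Mean scale index delta over the set of available indices.
    delta = sum(j * len(c) for j, c in coeffs_by_scale.items()) / n_total
    energy = sum(
        2.0 ** ((alpha - 2.0) * j) * sum(d * d for d in c)
        for j, c in coeffs_by_scale.items()
    )
    return math.log(energy) + delta * math.log(2.0) * (2.0 - alpha)


def local_whittle_estimate(coeffs_by_scale, grid_size=1999):
    """Grid-search minimizer of the contrast over the open interval (0, 2)."""
    grid = [2.0 * (i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda a: whittle_contrast(a, coeffs_by_scale))
```

A grid search is used only to keep the sketch dependency-free; since the contrast is a smooth function of a single parameter, any one-dimensional minimizer could be substituted.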

In order to simplify the proof of our result, we henceforth take $\Delta$ to be of the form 

$$\Delta = \{(j,k) : J_0 < j \le J_1,\ 0 \le k \le n_j - 1\},$$

with $J = \max\{j : 2^j \le (T - M + 1)/(M + 1)\}$, $n_j = 2^{J-j}$ and integers $J_0$ and $J_1$ such that 

$$0 \le J_0 < J_1 \le J. \tag{4.3}$$

The sequence of integers $J$ depends on $T$ in such a way that $2^J \asymp T$. Note that the 
dependence of the sequences $J$, $J_0$, $J_1$, $n_j$ etc. on $T$ is suppressed in our notation. 
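
The bookkeeping for this index set is easy to get wrong, so a small sketch may help. It assumes the reading in which the maximal scale $J$ is the largest $j$ with $2^j \le (T-M+1)/(M+1)$, the scales used are $J_0 < j \le J_1$, and there are $n_j = 2^{J-j}$ coefficients at scale $j$; the function names are hypothetical. The mean scale index computed by direct summation can be compared with the closed form $J_0 + 2 - (J_1 - J_0)/(2^{J_1 - J_0} - 1)$, which follows from summing the geometric series.

```python
def max_scale(T, M):
    """Largest j such that 2**j <= (T - M + 1) / (M + 1)."""
    bound = (T - M + 1) / (M + 1)
    if bound < 1:
        raise ValueError("sample too short for wavelet support length M")
    J = 0
    while 2 ** (J + 1) <= bound:
        J += 1
    return J


def mean_scale_index(J, J0, J1):
    """Mean of j over the index set, with n_j = 2**(J - j) coefficients
    at each scale j = J0 + 1, ..., J1."""
    if not 0 <= J0 < J1 <= J:
        raise ValueError("need 0 <= J0 < J1 <= J")
    scales = range(J0 + 1, J1 + 1)
    total = sum(2 ** (J - j) for j in scales)
    return sum(j * 2 ** (J - j) for j in scales) / total


J = max_scale(T=1024, M=1)            # J = 9 here
delta = mean_scale_index(J, J0=3, J1=J)
closed_form = 3 + 2 - (J - 3) / (2 ** (J - 3) - 1)
```

Because the number of coefficients halves from one scale to the next, the mean scale index stays within two units of $J_0$ however large $J_1$ is.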

4.2. Consistency 

Our estimator is consistent in the potentially unstable case, that is when $\alpha$ is not assumed 
to be in $(1,2)$, provided that the assumptions on the functions $\phi$ and $\psi$ are strengthened. 
We assume that 

$$\int_{-\infty}^{\infty} s\,\psi(s)\,ds = 0, \tag{4.4}$$

and there exist constants $a$ and $b$ such that, for all $t \in \mathbb{R}$, 

$$\sum_{k\in\mathbb{Z}} k\,\phi(t - k) = a + bt. \tag{4.5}$$

These conditions are not satisfied by the Haar wavelet, but hold for any Daubechies 
wavelets; see Cohen [2]. 

Theorem 4.1. Let Assumption 1 hold with $p^* \ge 4$. Assume that $J_0$ and $J_1$ depend on 
$T$ in such a way that 

$$\lim_{T\to\infty} J_0 = \lim_{T\to\infty} (J_1 - J_0) = \infty, \tag{4.6}$$

$$\limsup_{T\to\infty} J_0/J < 1/\alpha, \tag{4.7}$$

$$\limsup_{T\to\infty} J_1/J < 1/(2 - \alpha). \tag{4.8}$$

Then $\hat\alpha$ is a consistent estimator of $\alpha$. Moreover if $\alpha \in (1,2)$, then conditions (4.4), (4.5) 
and (4.8) are not necessary for the same result to hold. 

Remark 4.1. Conditions (4.6), (4.7) and (4.8) are satisfied by the choice $J_0 = \lfloor J/2 \rfloor$ 
and $J_1 = \lfloor J/2 + \log(J) \rfloor$. 
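
The choice of scales in Remark 4.1 can be written as a small helper. This is a sketch under two assumptions that the remark does not spell out: the logarithm is taken to be the natural one, and the upper scale is capped at $J$ so that $J_1 \le J$ holds for small samples. The function name `choose_scales` is hypothetical.

```python
import math


def choose_scales(J):
    """Return (J0, J1) = (floor(J/2), floor(J/2 + log J)), with J1 capped at J."""
    J0 = J // 2
    J1 = min(J, math.floor(J / 2 + math.log(J)))
    if J1 <= J0:
        raise ValueError("J is too small for this choice of scales")
    return J0, J1


J0, J1 = choose_scales(9)   # J0 = 4, J1 = 6
```

Since $1/2 < 1/\alpha$ and $1/2 < 1/(2-\alpha)$ for every $\alpha \in (0,2)$, this choice does not require any prior knowledge of $\alpha$.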

Proof of Theorem 4.1. For clarity of notation, we denote $\sum_j = \sum_{j=J_0+1}^{J_1}$, 
$\Delta_j := \{k : (j,k) \in \Delta\}$ and $\#\Delta_j = n_j$. Elementary computations give 

$$\delta = J_0 + 2 + (J_0 - J_1)/(2^{J_1 - J_0} - 1) \tag{4.9}$$

so that $\delta - (J_0 + 2) \to 0$ under (4.6). By Karamata's representation theorem, the slowly 
varying function $\mathcal{L}$ defined in (3.6) can be written as 

$$\mathcal{L}(z) = c\,(1 + r(z)) \exp\left\{ \int_1^z \frac{\ell(s)}{s}\,ds \right\}$$

with $c > 0$ and $\lim_{z\to\infty} \ell(z) = \lim_{z\to\infty} r(z) = 0$. Define $\mathcal{L}_0(z) = c \exp\{\int_1^z s^{-1} \ell(s)\,ds\}$, 
$r^*(z) = \sup_{z' \ge z} |r(z')|$ and $\ell^*(z) = \sup_{z' \ge z} |\ell(z')|$. The functions $r^*$ and $\ell^*$ are 
non-increasing and tend to zero at infinity. We now introduce some notation that will be 
used throughout the proof: 



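$$W(\alpha') = \log\left( \sum_j 2^{(\alpha' - \alpha)j}\, n_j\, \mathcal{L}(2^j) \right) + \delta \log(2)\,(2 - \alpha'),$$

$$W_0(\alpha') = \log\left( \sum_j 2^{(\alpha' - \alpha)j}\, n_j\, \mathcal{L}_0(2^j) \right) + \delta \log(2)\,(2 - \alpha'),$$

$$w_j(\alpha') = \frac{2^{(\alpha' - \alpha)j}\, n_j\, \mathcal{L}(2^j)}{\sum_{j'} 2^{(\alpha' - \alpha)j'}\, n_{j'}\, \mathcal{L}(2^{j'})}, \qquad w_{j,0}(\alpha') = \frac{2^{(\alpha' - \alpha)j}\, n_j\, \mathcal{L}_0(2^j)}{\sum_{j'} 2^{(\alpha' - \alpha)j'}\, n_{j'}\, \mathcal{L}_0(2^{j'})},$$

$$v_j = \mathcal{L}(2^j)\, 2^{(2-\alpha)j}, \qquad \Lambda_j = n_j^{-1} \sum_{k=0}^{n_j - 1} \{ v_j^{-1}\, \tilde d_{j,k}^{\,2} - 1 \}, \qquad E(\alpha') = \sum_j w_j(\alpha')\, \Lambda_j.$$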



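We have 

$$W(\alpha') - W_0(\alpha') = \log\left( 1 + \frac{\sum_j 2^{(\alpha' - \alpha)j}\, n_j\, \mathcal{L}_0(2^j)\, r(2^j)}{\sum_j 2^{(\alpha' - \alpha)j}\, n_j\, \mathcal{L}_0(2^j)} \right).$$

Here the fraction inside the logarithm is bounded by $r^*(2^{J_0})$, thus, for $J_0$ large enough, 

$$\sup_{\alpha' \in (0,2)} |W(\alpha') - W_0(\alpha')| \le C\, r^*(2^{J_0}).$$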
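Standard algebra yields 

$$W_0'(\alpha') = \log 2 \sum_j w_{j,0}(\alpha')\,(j - \delta),$$

$$W_0''(\alpha') = \log^2(2) \sum_j w_{j,0}(\alpha') \left( j - \sum_{j'} w_{j',0}(\alpha')\, j' \right)^2.$$

By Lemma A.8, under (4.6), 

$$\lim_{T\to\infty} W_0'(\alpha) = 0, \qquad \lim_{T\to\infty} W_0''(\alpha) = 2.$$

Thus, there exist $\eta > 0$ and $\zeta > 0$ such that 

$$\liminf_{T\to\infty}\ \inf_{\alpha' \in (\alpha - \eta, \alpha + \eta)} W_0''(\alpha') \ge \zeta.$$

This implies that, for large $T$ and some positive constant $c$, 

$$W(\hat\alpha) - W(\alpha) \ge W_0'(\alpha) \log(2)\,(\hat\alpha - \alpha) + c\,(\hat\alpha - \alpha)^2 - 2 r^*(2^{J_0}). \tag{4.10}$$

Since $W_0'(\alpha) \to 0$ and $|\hat\alpha - \alpha| \le 2$, this implies that, for all $\epsilon > 0$, 

$$\limsup_{T\to\infty} P((\hat\alpha - \alpha)^2 \ge \epsilon) \le \limsup_{T\to\infty} P(W(\hat\alpha) - W(\alpha) \ge c\epsilon). \tag{4.11}$$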

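Write 

$$\hat W(\alpha') = W(\alpha') + \log\{1 + E(\alpha')\},$$

$$W(\hat\alpha) - W(\alpha) = \hat W(\hat\alpha) - \hat W(\alpha) - \log\{1 + E(\hat\alpha)\} + \log\{1 + E(\alpha)\} \le 2 \sup_{\alpha' \in (0,2)} |\log\{1 + E(\alpha')\}|. \tag{4.12}$$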

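Consistency will follow from (4.11) and (4.12) provided that we can prove that 
$\sup_{\alpha' \in (0,2)} |E(\alpha')| = o_P(1)$. If $\alpha > 1$, take $\epsilon \in (0, (\alpha - 1)/2)$ such that 
$\limsup J_0/J < 1/(\alpha + \epsilon)$, which is possible by assumption (4.7). Define 

$$J_2 = \begin{cases} J_1, & \text{if } \alpha \le 1, \\ J_1 \wedge [J/(\alpha + \epsilon)], & \text{if } \alpha > 1, \end{cases} \tag{4.13}$$

so that, for $T$ large enough, $J_0 < J_2 \le J_1$. Write 

$$E(\alpha') = \sum_{j=J_0+1}^{J_2} w_j(\alpha')\, \Lambda_j + \sum_{j=J_2+1}^{J_1} w_j(\alpha')\, \Lambda_j =: E_1(\alpha') + E_2(\alpha'),$$
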
with the convention that $E_2 = 0$ if $J_2 = J_1$. By Lemma A.6, 

$$\sup_{\alpha' \in (0,2)} |E_1(\alpha')| = O_P(2^{-\epsilon_1 J}), \tag{4.14}$$

for some positive $\epsilon_1$. Now treat $E_2$ for $\alpha > 1$ and $J_2 = [J/(\alpha + \epsilon)] < J_1$. For all $\alpha' \in (0,2)$, 
we have $\alpha' - \alpha - 1 < -2\epsilon$. Since $\mathcal{L}$ is slowly varying, we obtain, for some positive constant 
$C$, for all $j = J_2 + 1, J_2 + 2, \dots, J_1$, $w_j(\alpha') \le C 2^{-\epsilon(J_2 - J_0)}$. Using Lemma A.5, it follows 
that 

$$E\left[ \sup_{\alpha' \in (0,2)} |E_2(\alpha')| \right] \le C\,(J_1 - J_2)\, 2^{-\epsilon(J_2 - J_0)} = O(2^{-\epsilon_2 J}) \tag{4.15}$$

for some $\epsilon_2 > 0$ because $\limsup J_0/J < 1/(\alpha + \epsilon)$. This concludes the proof. 



□ 



4.3. Rate of convergence in the stable case 

Theorem 4.2. Let Assumption 1 hold with $\alpha \in (1,2)$ and $p^* = 4$. Assume, moreover, 
that $L_4$ is bounded and that $\mathcal{L}(z) = c + O(z^{-\beta})$ with $c > 0$ and $\beta > 0$. 

If $X$ is observed continuously on $[0,T]$, that is, $\tilde d_{j,k} = d_{j,k}$, then the rate of convergence 
in probability of $\hat\alpha$ is $T^{-\beta/(2\beta + \alpha)}$, obtained for $J_0 = \lfloor J/(2\beta + \alpha) \rfloor$ and $J_1 = J$. 

If $X$ is observed at discrete time points $1, 2, \dots, T$, that is, $\tilde d_{j,k} = d^D_{j,k}$, then the rate 
of convergence in probability of $\hat\alpha$ is $T^{-\gamma/(2\gamma + \alpha)}$ with $\gamma = \beta \wedge (2 - \alpha)$, obtained for 
$J_0 = \lfloor J/(2\gamma + \alpha) \rfloor$ and $J_1 = J$. 

Remark 4.2. Observe that the choice of $J_0$ corresponding to the best rate for $\hat\alpha$ depends 
both on the unknown smoothness parameter $\beta$ and on the parameter $\alpha$ itself. The case 
of discrete observations is similar to that of continuous-time observations but with the 
smoothness parameter $\beta$ replaced by $\gamma = \beta \wedge (2 - \alpha)$, resulting in a slower rate of 
convergence. This can be explained by the aliasing induced by the interpolation step (3.9). 
It is clear that these rates of convergence are the best possible for our estimator under 
the assumption on $\mathcal{L}$, since this choice of $J_0$ makes the squared bias and the variance of 
the same order of magnitude. However, to our knowledge, the best possible rate of 
convergence for the estimation of $\alpha$ under these observation schemes is an open question. 
In other words, whether our estimator is rate optimal remains unknown. 
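
The rates and tuning choices of Theorem 4.2 can be tabulated with a few lines of code. The sketch below is illustrative: the function names are hypothetical, `beta` denotes the exponent in the assumption that the slowly varying factor of the coefficient variance equals $c + O(z^{-\beta})$, and the returned exponent $\rho$ is such that the rate is $T^{-\rho}$.

```python
import math


def effective_smoothness(alpha, beta, discrete):
    """beta for continuous observation, min(beta, 2 - alpha) for discrete."""
    if not (1.0 < alpha < 2.0) or not (0.0 < beta < math.inf):
        raise ValueError("need alpha in (1, 2) and finite beta > 0")
    return min(beta, 2.0 - alpha) if discrete else beta


def rate_exponent(alpha, beta, discrete=False):
    """Exponent rho such that the rate of convergence is T**(-rho)."""
    gamma = effective_smoothness(alpha, beta, discrete)
    return gamma / (2.0 * gamma + alpha)


def optimal_lower_scale(J, alpha, beta, discrete=False):
    """Lower scale J0 = floor(J / (2 * gamma + alpha)) attaining that rate."""
    gamma = effective_smoothness(alpha, beta, discrete)
    return math.floor(J / (2.0 * gamma + alpha))


# alpha = 1.5, beta = 1: discrete sampling lowers the exponent and raises J0.
rho_cont = rate_exponent(1.5, 1.0)                      # 1 / 3.5
rho_disc = rate_exponent(1.5, 1.0, discrete=True)       # 0.5 / 2.5
j0_cont = optimal_lower_scale(20, 1.5, 1.0)             # 5
j0_disc = optimal_lower_scale(20, 1.5, 1.0, True)       # 8
```

In this example discrete sampling discards three additional fine scales, which is the practical cost of the aliasing mentioned in Remark 4.2.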

The rate of convergence of our estimator is derived under assumptions on the 
function $\mathcal{L}$. The following lemma allows us to check them through conditions on the joint 
distribution of $(U, \eta)$. 

Lemma 4.3. Let Assumption 1 hold. 

(i) If there exist positive constants $c$ and $\beta$ such that, as $t \to \infty$, 

$$L_2(t) = c + O(t^{-\beta}),$$

then there exists a constant $c'$ such that, as $z \to \infty$, 

$$\mathcal{L}(z) = c' + \begin{cases} O(z^{-\beta}), & \text{if } \beta < 2 - \alpha, \\ O(z^{\alpha - 2} \log z), & \text{if } \beta = 2 - \alpha, \\ O(z^{\alpha - 2}), & \text{if } \beta > 2 - \alpha. \end{cases} \tag{4.16}$$

(ii) If there exist positive constants $c$ and $\beta$ such that, as $t \to 0$, 

$$E[U^2 \{1 - \cos(\eta t)\}] = c\,|t|^{\alpha} \{1 + O(|t|^{\beta})\},$$

then there exists a constant $c'$ such that, as $z \to \infty$, 

$$\mathcal{L}(z) = c' + O(z^{-\beta}), \tag{4.17}$$

provided that $\psi$ belongs to the Sobolev space $W^{(\alpha + \beta)/2 - 1}$, that is, 

$$\int_{-\infty}^{\infty} (1 + |\xi|)^{\alpha + \beta - 2}\, |\psi^*(\xi)|^2\,d\xi < \infty, \tag{4.18}$$

where $\psi^*$ denotes the Fourier transform of $\psi$, 

$$\psi^*(\xi) = \int_0^M \psi(t)\, e^{-i\xi t}\,dt. \tag{4.19}$$
Jo 

Example 4.1. Assume that $\eta$ has a Pareto distribution, that is, $P(\eta > t) = (1 \vee t)^{-\alpha}$, 
and is independent of $U$. This corresponds to Lemma 4.3(i) with $\beta = \infty$, and we can 
easily compute an exact expression for the $O(z^{\alpha - 2})$ term: 

c + oE\U^_ za _ 2 + 2 

2 — a 

The best possible rate of convergence of $\hat\alpha$ is thus $T^{-(2-\alpha)/(4-\alpha)}$, regardless of the 
observation scheme. 



Example 4.2. Let $\alpha \in (1,2)$ and suppose that $\eta$ is the absolute value of a symmetric 
$\alpha$-stable random variable. Then Assumption 1 holds, say, if $U$ is independent of $\eta$ and 
has sufficiently many finite moments, and 

$$E[\cos(\eta t)] = \exp(-\sigma |t|^{\alpha}) = 1 - \sigma |t|^{\alpha} + O(|t|^{2\alpha}).$$

By Lemma 4.3, the best possible rate of convergence of $\hat\alpha$ is thus $T^{-\gamma/(2\gamma + \alpha)}$ with $\gamma = \alpha$ 
for continuous-time observations and $\gamma = 2 - \alpha$ for discrete-time observations. 

In the following, we give a decomposition of the error valid under the assumption 

$$0 < \liminf_{T\to\infty} \frac{J_0}{J} \le \limsup_{T\to\infty} \frac{J_0}{J} < 1.$$

Optimizing $J_0$ in this decomposition will then give the result. We use the same notation 
as in the proof of Theorem 4.1 with $J_1 = J$. We first give a rough rate of convergence 
for $\hat\alpha$ by adapting the proof of Theorem 4.1. Under the present assumptions, $\mathcal{L}_0(z) = c$, 
which implies $W_0'(\alpha) = 0$, and $r^*(z) = O(z^{-\beta})$ as $z \to \infty$. Then, (4.10), (4.12), (4.14) and 
(4.15) yield, for some $\epsilon > 0$, 

$$(\hat\alpha - \alpha)^2 = O_P(2^{-\epsilon J} + 2^{-\beta J_0}). \tag{4.20}$$

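Since $\hat\alpha$ is consistent and $\alpha$ is an interior point of the parameter set, the first derivative 
of the contrast function vanishes at $\hat\alpha$ with probability tending to one. Hence 

$$\log(2)\, \frac{\sum_{(j,k)\in\Delta} j\, 2^{(\hat\alpha - 2)j}\, \tilde d_{j,k}^{\,2}}{\sum_{(j,k)\in\Delta} 2^{(\hat\alpha - 2)j}\, \tilde d_{j,k}^{\,2}} = \delta \log(2).$$

By the definition of $\delta$, this yields 

$$0 = \sum_{(j,k)\in\Delta} (j - \delta)\, 2^{(\hat\alpha - 2)j}\, \tilde d_{j,k}^{\,2} = \sum_{(j,k)\in\Delta} (j - \delta)\, 2^{(\alpha - 2)j}\, \tilde d_{j,k}^{\,2} + \log(2)\,(\hat\alpha - \alpha) \sum_{(j,k)\in\Delta} j\,(j - \delta)\, 2^{(\bar\alpha - 2)j}\, \tilde d_{j,k}^{\,2} \tag{4.21}$$

for a random $\bar\alpha$ between $\alpha$ and $\hat\alpha$. By the definition of $\Lambda_j$, (4.21) implies that 

$$\hat\alpha - \alpha = - \frac{\sum_j (j - \delta)\, 2^{-j}\, \mathcal{L}(2^j)\,(1 + \Lambda_j)}{\log 2 \sum_j j\,(j - \delta)\, 2^{(\bar\alpha - \alpha - 1)j}\, \mathcal{L}(2^j)\,(1 + \Lambda_j)}.$$

Denote the sum in the denominator by $D$, and write 

$$D = \sum_j j\,(j - \delta)\, 2^{-j}\, \mathcal{L}(2^j) + \sum_j j\,(j - \delta)\, \mathcal{L}(2^j)\, 2^{-j}\,(2^{(\bar\alpha - \alpha)j} - 1)(1 + \Lambda_j) + \sum_j j\,(j - \delta)\, 2^{-j}\, \mathcal{L}(2^j)\, \Lambda_j =: S + R_1 + R_2.$$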

Using Lemma A.8 and (4.9), one easily obtains that $S \sim 2^{1 - J_0}$ as $T \to \infty$. 

Using Lemma A.5, and the fact that $|\bar\alpha - \alpha| \le |\hat\alpha - \alpha| = o_P(J^{-2})$, one similarly obtains 
$R_1 = o_P(2^{-J_0})$. To bound $R_2$, we proceed as for bounding $E(\alpha')$ in the proof of 
Theorem 4.1 (here with $\alpha' = \alpha > 1$): we write $\sum_j = \sum_{j=J_0+1}^{J_2} + \sum_{j=J_2+1}^{J_1}$ and apply 
Lemmas A.5 and A.6 to obtain $R_2 = o_P(2^{-J_0})$. Hence, we finally obtain 

$$\hat\alpha - \alpha = -\frac{2^{J_0 - 1}}{\log 2} \left\{ \sum_j (j - \delta)\, 2^{-j}\, \mathcal{L}(2^j) + \sum_j (j - \delta)\, 2^{-j}\, \mathcal{L}(2^j)\, \Lambda_j \right\} \{1 + o_P(1)\}. \tag{4.22}$$
In (4.22), the terms inside the curly brackets are interpreted as a deterministic bias term 
and a stochastic fluctuation term. The bias is bounded as follows: 

$$2^{J_0} \sum_j (j - \delta)\, 2^{-j}\, \mathcal{L}(2^j) = 2^{J_0} \sum_j (j - \delta)\, 2^{-j}\, \{\mathcal{L}(2^j) - c\} = O(2^{-\beta J_0}). \tag{4.23}$$

In the case of continuous-time observations, that is, $\tilde d_{j,k} = d^S_{j,k}$ or $\tilde d_{j,k} = d_{j,k}$, we have 

$$\sum_j (j - \delta)\, 2^{-j}\, \mathcal{L}(2^j)\, \Lambda_j = O_P(2^{-J/2 + (\alpha/2 - 1)J_0}). \tag{4.24}$$

Gathering this bound with (4.22) and (4.23), and setting $J_0 = J/(2\beta + \alpha)$, yields the first 
claim of Theorem 4.2, that is, $\hat\alpha - \alpha = O_P(2^{-\beta J/(2\beta + \alpha)})$. 

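We now prove (4.24). Define $\beta_j = n_j^{-1} \sum_{k=0}^{n_j - 1} \{ v_j^{-1} (d^S_{j,k})^2 - 1 \}$. Then $\beta_j = \Lambda_j$ if 
$\tilde d_{j,k} = d^S_{j,k}$. Since $\alpha > 1$, Lemmas 3.1 and A.2 yield, for some positive constant $C$, 

$$E[\beta_j^2] = \mathrm{var}(\beta_j) \le C\, \frac{L_4(2^j)}{\mathcal{L}^2(2^j)}\, 2^{\alpha j - J}. \tag{4.25}$$

Since $\mathcal{L}$ is bounded away from zero and $L_4$ is bounded by assumption, the ratio $L_4/\mathcal{L}^2$ 
is also bounded. The Minkowski inequality then yields 

$$\left( E\left[ \Big\{ \sum_j (j - \delta)\, 2^{-j}\, \mathcal{L}(2^j)\, \beta_j \Big\}^2 \right] \right)^{1/2} = O(2^{-J/2 + (\alpha/2 - 1)J_0}). \tag{4.26}$$

If $\tilde d_{j,k} = d_{j,k}$, we use (A.3) in Lemma A.5, and obtain $E[|\Lambda_j - \beta_j|] \le C n_j^{-1/2}$ for some 
constant $C > 0$. Hence, in this case, since $-1/2 < \alpha/2 - 1$, 

$$E\left| \sum_j (j - \delta)\, 2^{-j}\, \mathcal{L}(2^j)\,(\Lambda_j - \beta_j) \right| \le C\, 2^{-J/2} \sum_j |j - \delta|\, \mathcal{L}(2^j)\, 2^{-j/2} = o(2^{-J/2 + (\alpha/2 - 1)J_0}). \tag{4.27}$$

Inequalities (4.26) and (4.27) imply (4.24). 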

We now briefly adapt the previous proof to the case of discrete observations. Define 
$v^D_j = E[(d^D_{j,k})^2]$. Lemma A.4(iii) implies $v^D_j = v_j + O(1)$. Thus we have 
$v^D_j = \mathcal{L}^D(2^j)\, 2^{(2-\alpha)j}$ and 

$$\mathcal{L}^D(z) = \mathcal{L}(z) + O(z^{\alpha - 2}) = c + O(z^{-\gamma}),$$

with $\gamma = \beta \wedge (2 - \alpha)$. Then, defining 

$$\Lambda^D_j = n_j^{-1} \sum_{k=1}^{n_j} \{ (v^D_j)^{-1} (d^D_{j,k})^2 - 1 \},$$

we obtain that (4.22) still holds with $\mathcal{L}^D$ and $\Lambda^D$ replacing $\mathcal{L}$ and $\Lambda$, respectively. 
Lemma A.7 implies that $\Lambda^D_j$ has the same order of magnitude as $\Lambda_j$, so that the stochastic 
fluctuation term has the same order of magnitude as in the previous case. The difference 
comes from the bias term, which is $O(2^{-\gamma J_0})$. Thus, $\hat\alpha - \alpha = O_P(2^{-\gamma J_0} + 2^{-J/2 + \alpha J_0/2})$, and 
setting $J_0 = J/(2\gamma + \alpha)$ yields the second claim of Theorem 4.2. 

5. Concluding remarks 

In this work, we have proved the validity of a wavelet method for the estimation of the 
long-memory parameter of an infinite-source Poisson traffic model, either in a stable or 
in an unstable state, that is, when it does or does not converge to a stationary process. 
We have shown that a suitable choice of the scales in the estimator (see Remark 4.1) 
yields a consistent estimator in both situations, and checked that the estimator is robust 
to discrete data sampling. 

However, the study of the rates raises some questions concerning the optimality of
this estimator. To draw a comparison, suppose that one directly observes the durations
$\eta_1, \dots, \eta_n$ of clients arriving at times $t_1, \dots, t_n$ in $[0,T]$. Then one can use the Hill
estimator for estimating the tail index $\alpha$. Since $T$ and $n$ are asymptotically proportional
and $\eta_1, \dots, \eta_n$ are independent and identically distributed, the rates of this estimator
are those derived in Hall and Welsh [5]. In particular, if $\eta$ has a Pareto distribution,
then a parametric rate $\sqrt{T}$ can be obtained. On the other hand, in the same situation,
our wavelet estimator defined on the observations $\{X(t), t \in [0,T]\}$ has a dramatically
deteriorating rate for $\alpha$ close to 2. It remains to establish whether this discrepancy comes
from the choice of the estimator or from the fact that the durations $\eta_k$ are not directly
observed.
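
To make the comparison concrete, here is a minimal sketch of the Hill estimator applied to directly observed durations. It assumes exactly Pareto durations simulated with the standard-library `random.paretovariate`. The function names, the sample size, and the number `k` of upper order statistics are illustrative choices, not taken from the paper.

```python
import math
import random


def hill_estimator(samples, k):
    """Hill estimate of the tail index from the k largest observations."""
    if not 1 <= k < len(samples):
        raise ValueError("k must satisfy 1 <= k < len(samples)")
    ordered = sorted(samples, reverse=True)
    threshold = ordered[k]  # the (k+1)-th largest observation
    # Average log-excess of the k largest values over the threshold.
    mean_log_excess = sum(math.log(x / threshold) for x in ordered[:k]) / k
    return 1.0 / mean_log_excess


def simulate_durations(alpha, n, seed=0):
    """Hypothetical helper: n i.i.d. Pareto(alpha) durations on [1, infinity)."""
    rng = random.Random(seed)
    return [rng.paretovariate(alpha) for _ in range(n)]


durations = simulate_durations(alpha=1.5, n=20000, seed=1)
alpha_hat = hill_estimator(durations, k=2000)
print(f"Hill estimate of alpha: {alpha_hat:.3f}")
```

In the exact Pareto case any `k` gives a sensible estimate; for distributions that are only regularly varying, the choice of `k` involves a bias-variance trade-off, which is what the rates of Hall and Welsh [5] quantify.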

Finally, let us draw a practical conclusion from our study. Care should be
taken with the choice of the scales used in the estimation, as shown by the conditions on
$J_0$ and $J_1$. In particular, if only discrete observations are available, the best possible rate
of convergence is obtained for a much larger value of $J_0$ than if continuous observations
are available. Too small a value of $J_0$ will induce a substantial bias for finite samples.
Practitioners should be aware of this restriction and be careful in the interpretation of
the results. These questions will be tackled numerically in future work.
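
The effect of the lower scale $J_0$ on the bias can be illustrated with a small noise-free sketch. It assumes synthetic variances of the form $v_j = (c + b\,2^{-\beta j})\,2^{(2-\alpha)j}$, mimicking a slowly varying factor that converges to a constant at rate $\beta$, and it uses an ordinary least-squares regression of $\log_2 v_j$ on $j$ rather than the weights of the estimator studied above. All names and numerical values are hypothetical.

```python
import math


def log_regression_alpha(v, j0, j1):
    """OLS slope of log2(v_j) against j for j0 < j <= j1, mapped to alpha = 2 - slope."""
    scales = list(range(j0 + 1, j1 + 1))
    logs = [math.log2(v[j]) for j in scales]
    j_mean = sum(scales) / len(scales)
    y_mean = sum(logs) / len(logs)
    num = sum((j - j_mean) * (y - y_mean) for j, y in zip(scales, logs))
    den = sum((j - j_mean) ** 2 for j in scales)
    return 2.0 - num / den


def synthetic_variances(alpha, beta, j_max, c=1.0, b=1.0):
    """Assumed model: v_j = (c + b * 2^(-beta j)) * 2^((2 - alpha) j), j = 1..j_max."""
    return {
        j: (c + b * 2.0 ** (-beta * j)) * 2.0 ** ((2.0 - alpha) * j)
        for j in range(1, j_max + 1)
    }


alpha, beta = 1.5, 0.5
v = synthetic_variances(alpha, beta, j_max=16)
bias_small_j0 = abs(log_regression_alpha(v, 0, 16) - alpha)  # scales 1..16
bias_large_j0 = abs(log_regression_alpha(v, 8, 16) - alpha)  # scales 9..16
print(f"bias with J0=0: {bias_small_j0:.4f}, with J0=8: {bias_large_j0:.4f}")
```

With real data, discarding the finest scales reduces this bias at the price of a larger variance, since fewer coefficients remain; the sketch shows only the bias side of the trade-off.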

Appendix: Technical results 

The following technical lemmas are proved in Fay et al. [4].

Lemma A.1. Let Assumption 1 hold. Let $f$ be a bounded measurable compactly supported
function such that $\int f(s)\,ds = 0$. Define

$$\tilde f(t,v,w) = w \int_t^{t+v} f(s)\,ds.$$

Then $\int |\tilde f(t,v,w)|^p \,dt\,\nu(dv,dw) < \infty$, $E[N_S(\tilde f)] = 0$ and $\int_0^\infty X(s) f(s)\,ds =
N_S(\tilde f\,\mathbb{1}_{\mathbb{R}_+\times\mathbb{R}_+\times\mathbb{R}})$. If, moreover, $E[\eta] < \infty$, then $N_S(\tilde f) = \int X_S(s) f(s)\,ds$.

Lemma A.2. Let Assumption 1 hold with $p^* > 4$. Then, there exists a positive constant
$C$ such that

$$\operatorname{var}\Bigl(\sum_{k=0}^{n-1} (d^S_{j,k})^2\Bigr) \le C n \bigl\{ L_2^2(2^j)\,2^{(4-2\alpha)j} + L_4(2^j)\,2^{(3-\alpha)j} \bigr\}.$$

Note that the first term dominates for $\alpha < 1$ and the second dominates for $\alpha > 1$.

Lemma A.3. Let Assumption 1 hold. Let $f$ be a bounded measurable compactly supported
function such that $\int f(s)\,ds = 0$. Define



f(t,v,w)=w g t , v (s)f(s)ds, f(t,v,w)=w h t , v (s)f(s)ds 



Then, for p = 1, . . . ,p* , J \f(t, v, w)\ p dt v( dv, dw) < oo, J \f(t, v, w)\ p dtv{ dv, dw) < oo, 
fI^[X}{s)f(s)ds = N s (f 1m + xM + xm), andE[N s {f)}=E[N s (f)}=0. If, moreover, Efo] < 
oo, then Ns(f) = J I [A S ]( S )/( S ) ds. 



Applying Lemma A. 3, we can extend the definition of gc^ in (3.10) to the case E[r)] 



oo by 



d^=N s ^, k ). 



(A.l) 



Lemma A. 4. (i) Let Assumption 1 hold with p* > 1 and a £ (0,2). Then E[d^®] = 
for all j > and k £ Z. 

(ii) Let Assumption 1 hold with p* > 2 and a £ (0, 2). TTien var(d^ k — dj®) is bounded 
uniformly for j £N and k £ Z. 

(iii) Let Assumption 1 hold with p* > 2 and a £ (1, 2). TVien | var(dj fc ) — var(dj^?)| is 
bounded uniformly for j £ N and fc € Z. 



Lemma A.5. Let Assumption 1 hold with $\alpha \in (1,2)$ and $p^* > 2$. Then

sup E|Aj| = 0(1); 

0<j<J 



sup n _1/2 E 

n>l,j>0 



fc=0 



< oo. 



(A.2) 
(A.3) 



Lemma A.6. Let Assumption 1 hold with $\alpha \in (0,2)$ and $p^* > 4$. If $\alpha < 1/2$, assume
(4.4) and (4.5).
Let $J^*$ be a sequence depending on $J$ such that $\limsup J^*/J < (1/\alpha) \wedge (1/(2-\alpha))$.
Then, there exists $\epsilon > 0$ such that



sup 

u£S 



Op(2" eJ ) 



(A.4) 



where $\mathcal{S}$ is the set of sequences $u = (u_0, \dots)$ satisfying $\sum_{j\in\mathbb{N}} |u_j| = 1$.



Lemma A.7. Let Assumption 1 hold with $p^* > 4$ and $\alpha \in (1,2)$. Then, there exists a
positive constant $C$ such that



™ E(^) 2 )<CL,{V)n2^. 



(A.5) 



Lemma A.8. Let $p$ be a positive real and $p' := (2^p - 1)^{-1}$. Let $\ell^*$ be a non-increasing
function on $[1,\infty)$ such that $\lim_{s\to\infty} \ell^*(s) = 0$, and let $\ell$ be a function on $[1,\infty)$ such
that $|\ell(s)| \le \ell^*(s)$ for all $s \in [1,\infty)$. Define



t(s) 

L{x) — ccxp{ I — ds \ and oj 



Then, as $J_0 \to \infty$ and for any $\epsilon > 0$,



Ji 



"53 = Jo + 1 + + 0(£* (2 J »))) + 0(J X (2 - e) J »- Jl ), 

i=./o+i 



(A.6) 



^ Ujj 2 = Jl + 2J (1 + p') + 2// 2 + 3p' + 1 + //0(f (2 Jo )) 

j=Jo + l 

+ 0(jf(2-e) J °- Jl ). 



(A.7) 